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' Abstract 

We investigate the applicability of Quasi-Monte Carlo methods to Eu- 
^ ^. clidean lattice systems for quantum mechanics in order to improve the 

1^ ■ asymptotic error behavior of observables for such theories. In most cases 

' the error of an observable calculated by averaging over random observa- 

tions generated from an ordinary Markov chain Monte Carlo simulation 
behaves like N~^^^, where A*' is the number of observations. By means 
^ I of Quasi-Monte Carlo methods it is possible to improve this behavior for 

■ certain problems up to A'^"^. We adapted and applied this approach to 

simple systems like the quantum harmonic and anharmonic oscillator and 
, verified an improved error scaling. 

\q 

(3 ! 1 Introduction 

m 

Markov chain-Monte Carlo (Mc-MC) techniques are commonly the method of 
choice for the numerical evaluation of partition functions in statistical physics or 
path integrals in Euclidean time for models in high energy physics. The reason 
/\ . is that they are based on importance sampling and hence select the integration 

I points automatically according to the corresponding weight in the integrand. 

■ ■ ■ Many algorithms have been developed to implement a Mc-MC, starting from the 

Metropolis algorithm, heatbath and over-relaxation to cluster and hybrid Monte 
Carlo algorithms, see e.g. refs. [U |2- In this way, simulations of demanding 
4-dimensional quantum field theories became possible and, in fact, were carried 
out very successfully to e.g. compute the low-lying hadron spectrum ^ or 
deriving bounds for the Higgs boson mass [1]. 

The drawback of Mc-MC is that it estimates the desired quantity stochasti- 
cally and hence the results are affected by a statistical error which sometimes 
needs very long and computer time extensive samplings. Quantitatively, this 
sampling error behaves as 1/\/N for a (thermalized) sample size of N. This 
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error scaling behaviour is often a real stumbling block in such Mc-MC simula- 
tions. If we consider lattice quantum chromo dynamics as a typical system for 
Mc-MC calculations in high energy physics, then due to this error scaling and 
the very high computational demand of these simulations, it is often impossible 
to significantly decrease the error to the targeted precision. It would therefore 
be very desirable to have Monte Carlo methods available that possibly show a 
better error scaling. 

Such methods in fact exist in form of Quasi-Monte Carlo (QMC) techniques 
[5],|5], where it is known that the error scaling can be improved to an 
behaviour in the optimal case. However, although QMC methods have been 
analyzed theoretically very thoroughly and comprehensively, in practice they 
have found only limited application and, to the best of our knowledge, were 
never tested in models that are relevant for high energy physics. 

In this paper we therefore want to perform a very first step towards the 
very challenging goal of applying QMC methods to generic field theories by 
looking at the non-trivial case of the anharmonic quantum mechanical oscillator 
discretized on a finite Euclidean time lattice and evaluated in the corresponding 
path integral formulation [7]. For the case of the anharmonic oscillator the 
system is not Gaussian anymore and a successful application of QMC methods 
would be a first non-trivial test. Of course, even if such a test is successful, 
there is a long way to address eventually 4-dimensional quantum field theories, 
but a proof of concept would certainly open the promising road to attack field 
theories in the future. 

We will start our discussion with a description of the harmonic oscillator in 
its time-discretized path integral formulation. Here the problem is fully Gaus- 
sian and the application of QMC methods should lead to the optimal error 
scaling of l/iV, an expectation that we will see to be fulfilled. Nevertheless, the 
harmonic oscillator example can serve well to explain how QMC methods work 
and how this expected optimal error scaling behaviour is realized. 

We will then proceed to look at the anharmonic oscillator and we will demon- 
strate that also in this case QMC leads to an improved error scaling, although 
we will not be able to reach the optimal one. We consider this, nevertheless to 
be a very promising, non-trivial result which bears the potential that also other 
models in quantum mechanics, e.g. the topological quantum mechanical action 
of ref. [S] and even field theories can be evaluated by QMC methods. For a first 
account of our studies, we refer to the proceedings contribution of ref. [5]. 

2 Quantum Mechanical Harmonic and Anhar- 
monic Oscillator 

In this section we will discuss the basic steps for the quantization of the theory 
in the path integral approach and the discretization on a time lattice. The first 
step is the construction of the Lagrangian (resp. the action) of the corresponding 
classical mechanical system for a given path x{t) of a particle with mass Mq. 
For a numerically stable evaluation of the path integral it is essential to pass on 
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to Euclidean time. In this case the Lagrangian L and the action S is given by: 



Mq f dx\ 
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S{x) = / L{x,t) dt. (2) 
Jo 

Depending on the scenario (harmonic or anharmonic osciUator) the potential 
V{x) consists of two parts 

V{x) = ^x^ + ^ ' (3) 

anharmonic part 

such that the parameter A controls the anharmonic part of the theory. It should 
also be mentioned that in the anharmonic case the parameter fj? can take on 
negative values, leading then to a double well potential. 

The next step is to discretize time into equidistant time slices with a spacing 
of a. The path is then only defined on the time slices: 

t-^U^{i~l)-a i^l...d (4) 
x{t) Xi = x{ti) . (5) 

On the lattice the derivative with respect to the time appearing in ([T]) (first term) 
will be replaced by the forward finite difference \7xi = ^{xi+i —Xi). The choice 
of the lattice derivative is not unique and requires special care, particularly if 
one considers more complicated models like lattice QCD. But in [7] it was shown 
that the lattice derivative chosen here permits a well defined continuum limit. 
Putting all the ingredients together, we can write down the lattice action for 
the (an) harmonic oscillator 



-flatt 

i=l 
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For the path a cyclic boundary condition Xd+i — xi can be assumed. In the 
following the superscript "latt" will be dropped, as we will only refer to the 
lattice action from now on. The expectation value of an observable O of the 
quantized theory expressed in terms of the path integral reads as follows: 

{0{x)) = 4^ 0{x)e~s'^^'>dx^...dxa ^ 
/ijd e~'5'(^)da;i...da;d 

This expression is suitable for a numerical evaluation of certain quantities of 
the underlying theory. Up to now only Monte Carlo methods are known to give 
reliable results for dimensions d ^ 10. One type of such methods, often used in 
physics, is the Markov chain-Monte Carlo approach mostly applying the weight 
(X e""^*^^) for sampling paths {xi\ (so-called "importance sampling"). In the 
next sections, we will provide a summary of the mathematical results for the 
QMC methods and particularly recapitulate in a rather mathematical language 
the strict error scaling bounds for this method. The reader more interested 
directly in the results may move to section 7 directly. 
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3 Direct Monte Carlo and quasi— Monte Carlo 
methods 



In many practical applications one is interested in calculating quotients of the 
form ([7]) where the action S{.) and the observables 0{.) are usually smooth 
functions in high dimensions. In some special situations where one would like 
to deal with integrands of moderately high dimensions, one possible approach is 
to consider estimators /i,/2 for the integrals /i,/2 in the numerator and in the 
denominator of ([7]) separately, and then take /1//2 as an estimation of (0(x)). 
Another possible approach one can consider is given by the so called weighted 
uniform sampling (WUS) estimator, analyzed in [TU]. In the latter case, one take 
a joint estimator for the total quantity {0{x)), using a single direct sampling 
method. We will show some characteristics of the WUS estimator in section [71 
and we will refer from now on to the latter two approaches as direct sampling 
methods for estimating ([7]). In many interesting examples, we encounter the case 
were the action S{.) and the observable 0(.) lead to integrals /i,/2 of Gaussian 
type. Then the integrals /i,/2 can be written in the form 



/ 5i(x)e ^'^^c- ^x^^^ x= {xi,...,Xd), i = l,2 



(2^)'^/Vdet(C) V 

where C denotes the covariance matrix of the Gaussian density function. A 
transformation to the unit cube in IR.'' can be applied such that the corresponding 
integrals take the form 

/ g{A^-\z))dz^ [ fiz)dz^I^o.i]4f), z^{zi,...,zd). (8) 

J[0,1]'^ J[0,1]'' 

Here AA^ = C is some symmetric factorization of the covariance matrix, and 
^~^{z) :— ($^^(zi), . . . ,^~^{zd))'^ , where <I'^^(-) represents the inverse of the 
normal cumulative distribution function $(•). 

In the classical direct Monte-Carlo (MC) approach one tries to estimate ([5]) 
by generating samples pseudo-randomly. One starts with a finite sequence of 
independent identically distributed (i.i.d.) samples Pn — {zi, . . . , z^}, where 
the points Zj, 1 < j < N, have been generated from the uniform distribution in 
[0,1]'^. Then, the quadrature rule is fixed by taking the average of the function 
evaluations for / 

1 ^ 

as an approximation of the desired integral /j^ /(z) dz. The resulting es- 
timator Qat is unbiased. The integration error can be approximated via the 
central limit theorem, given that / belongs to i2([0, 1]*^). The variance of the 
estimator Qat is given by 



N 




f{z) dz 

[0,1]'' 




As measured by its standard deviation from zero the integration error associated 
with the MC approach is then of order 0(iV^ 2 ). The quality of the MC samples 
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relies on the selected pseudo-random number generators of uniform samples, 
here we use the Mersenne Twister generator from Matsumoto and Nishimura 
(see [H]). MC is in general a very reliable tool in high-dimensional integration, 
but the order of convergence is in fact rather poor. 

In contrast, quasi-Monte Carlo (QMC) methods generates deterministically 
point sets that are more regularly distributed than the pseudo-random points 
from MC (see [S], [H], [13], [B]). Typical examples of QMC are shifted lattice 
rules and low-discrepancy sequences. In order to give a short introduction to 
the subject, we define now the classical notion of discrepancy of a finite sequence 
of points Pn in [0, l)'^. Given Pm — {zi, . . . ,zn} a set of points in [0, 1)'', and a 
nonempty family I of Lebesgue-measurable sets in [0, 1)'', we define the classical 
discrepancy function by 



where cb is the characteristic function of B. This allows us to define the so 
called star discrepancy 

Definition 3.1 We define the star discrepancy D*{Pn) of the point set Pn 
by D*{Pn) := D{l;Pj^), where I is the family of all sub-intervals of the form 
IliLi [Oj "1*1)1 with Ui>0, 1 < i < d. 

The star discrepancy can be considered as a measure of the worst difference be- 
tween the uniform distribution and the sampled distribution in [0, 1)'' attributed 
to the point set Pn- The usual way to analyze QMC as a deterministic method 
is by choosing a class of integrand functions F, and a measure of discrepancy 
D{P]s[) for the point sets Pn- Then, the deterministic integration error is usually 
given in the form 



where V{f) measures a particular variation of the function f (z F- A classical 
particular error bound in this form is the famous Koksma-Hlawka inequality, 
where D{Pn) is taken to be the star discrepancy of the point set Pn, and 
V{f) is the variation in the sense of Hardy and Krause of /. In the context 
of QMC, a sequence of points in [0, 1)'' is called a low-discrepancy sequence if 
D*{Pn) — 0{N~-^ {\og{N))'^) for all truncations of the sequence to its first N 
terms. 



3.1 Quasi— Monte Carlo errors and complexity 

For error analysis of QMC methods, there are certain reproducing kernel Hilbert 
spaces ¥d of functions / : [0,1]'' — R that are particularly useful (see [Tl]). 
Let us denote now with (•, •) and || • || the inner product and norm in F^. A 
reproducing kernel is a function K : [0, 1]'^ x [0, 1]'* — >■ M. satisfying the properties 



D{I;Pn) :=sup 



N 




1. K{-, y) e ¥d for each y e [0, 1]'^ 

2. {f,K{-,y)) = f{y) for each y e [0, 1] 



and / e ¥d 
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If the integral 

/(/) = / fiz)dz 

J[0,l]<i 

is a continuous functional on the space F^;, then the worst case quadrature error 

ew(Fd):= sup \I{f)-QNif)\ 
/eFd,||/||<i 

for point sets Pat — {zi, . . . , z^} and quasi-Monte Carlo algorithms for the 
space ¥d can be given by 

ejv(Fd) = sup |(/, /lAr)| = ||/lAr|| 
II/II<1 

for some hpf E ¥d due to Riesz' representation theorem. In this case, the 
representer /ijv of the quadrature error is given explicitly in terms of the kernel 

by 

1 ^ 

hN{z)= K{z,y)dy~ -y^K{z,z,), V^Gp,!]''. 

Tensor product reproducing kernel Hilbert spaces are of particular interest, since 
the multivariate kernel result as the product of the underlying univariate kernels. 
In QMC error analysis, the weighted (anchored) tensor product Sobolev space 
introduced in [15] is often considered 

d 

Fd = (g)M^2'([0,l]), 

also denoted with = y^2,^{^\[^^^Y)- The weighted norm = (/,/)^ 

results from the inner product 



p p\ I U I pi I U I 

(/,5)7= E W^Jl j—f{zuA)j—g{zu,l)dz^, (9) 

n ~- . J[Q,l]M OZu dZu 



LiC{l,...,(i}iGM 



where for u C {1, . . . , d} we denote by |u| its cardinality, and 1) denotes 
the vector containing the coordinates of z with indices in u, and the other 
coordinates set equal to 1. 

In this case the reproducing kernel is given by 

d 

Kd,^{z,y) = - max(zj,j/j)]) for z,y E [0, l]'^. 

i=i 

The weighted tensor product Sobolev space allow for explicit QMC constructions 
deriving error estimates of the form 

eiv(F<i) < C{6)N-^+' for 5 e (0, i], (10) 

where the constant C (S) is independent on the dimension d, if the sequence of 
weights (jj) satisfies the condition (see [TB] ) 

oo ^ 

E 2(1-5) , 
7/ ' < oo . 
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Traditional unweighted function spaces considered for integration suffer from 
the curse of dimensionahty. Their weighted variants describe a setting where 
the variables or group of variables may vary in importance, corresponding 
to an anisotropic setting. Many integration problems in practice start with 
an isotropic setting but can be modified to an anisotropic setting using a 
proper transformation. This gives a partial explanation of why some very high- 
dimensional spaces become tractable for QMC. 

Explicit QMC constructions satisfying (jlOp are shifted lattice rules for weighted 
spaces. The rate ([TU| can be also obtained for Niederreiter and Sobol' sequences 
(see [IT])- The idea of "weighting" the norm of the spaces to obtain tractable 
results can be applied in fact to more general function spaces than smooth 
function spaces of tensor product form, and many integration examples can be 
found in [H]. In our numerical experiments, we used so far QMC algorithms 
based on a particular type of low-discrepancy sequences. Numerical experi- 
ments with shifted lattice rules will be carried out in the near future, following 
new techniques for fixing adequate weights introduced in |18j . 

4 Low— discrepancy {t, (i)-sequences 

The most well known type of low-discrepancy sequences are the so called (i, d)- 
sequences. To introduce how (t, m, d)-nets and {t, ci)-sequences are defined, we 
consider first elementary intervals in a integer base b > 2. Let E be any sub- 
interval of [0, 1)'' of the form E ~ JliLi [o-ib''^' , (a^ + with a^, q £ N, Ci > 
0, < Qi < for 1 < i < d. An interval of this form is called an elementary 
interval in base b. 

Definition 4.1 Let < t < m be integers. A (t,m,d)-net in base b is a point 
set Pjv of N ~ fe™ points in [0,1)'^ such that every elementary interval E in 
base b with Xd{E) = contains exactly 5* points. 

Definition 4.2 Lett>Obe an integer. A sequence Zi,Z2, ... of points in [0,1)'^ 
is a {t^d)- sequence in base b if for all integers k > and m > t, the point set 
consisting of N — b"^ points 'Li with kb"^ < i < (k + 1)6'", is a (t,m,d)-net in 
base b. 

The parameter t is called the quality parameter of the (t, d)-sequences. In 
[in], theorem 4.17, it is shown that (i, (i)-sequences are in fact low-discrepancy 
sequences. We reproduce this result in the following 

Tiieorem 4.3 The star- discrepancy D* of the first N terms Pn of a (i, d)- 

sequence in base b, satisfies 



where the implied constants depend only on b and d. If either d = 2 or b = 2, 
d — 3,4, we have 



ND*{Pn) < C{d,b)b\log{N)f + 0{b\log{N)Y-^), 




and otherwise 
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Explicit constructions of {t, d)-sequences are available. Examples are the 
generalized Faure, Sobol', Niederreiter and Niederreiter-Xing sequences. All 
these examples fall into the category of constructions called digital sequences. 
We refer to [TS| for further reading on this topic. 

5 Randomized QMC 

There are some advantages in retaining the probabilistic properties of the sam- 
pling. There are practical hybrid methods permitting us to combine the good 
features of MC and QMC. Randomization is an important tool for QMC if we 
are interested for a practical error estimate of our sample quadrature Qn to 
the desired integral. One goal is to randomize the deterministic point set P^r 
generated by QMC in a way that the estimator Qn preserves unbiasedness. 
Another important goal is to preserve the better equidistribution properties of 
the deterministic construction. 

The simplest form of randomization applied to digital sequences seems to 
be the technique called digital b-ary shifting. In this case, we add a random 
shift A G [0, l)"^ to each point of the deterministic set Pn — {zi, z^} using 
operations over the selected ring Ff,. The application of this randomization 
preserves in particular the t value of any projection of the point set (see [S] and 
references therein). The resulting estimator is unbiased. 

The second randomization method we consider is the one introduced by Owen 
(PPj) in 1995. He considered {t,m,d)-nets and (t, c?)-sequences in base b and 
applied a randomization procedure based on permutations of the digits of the 
values of the coordinates of points in these nets and sequences. This can be 
interpreted as a random scrambling of the points of the given sequence in such 
a way that the net structure remains unaffected. We do not discuss here in detail 
Owen's randomization procedure, or from now on called Owen's scrambling. The 
main results of this randomization procedure can be stated in the following 

Proposition 5.1 (Equidistribution) 

A randomized (t, m, d)-net in base b using Owen's scrambling is again a {t, m, d)- 
net in base b with probability 1. A randomized (t, d) -sequence in base b using 
Owen's scrambling is again a {t,d)- sequence in base b with probability 1. 

Proposition 5.2 (Uniformity) 

Let Zi be the randomized version of a point Zi originally belonging to a (t, m, d)- 
net in base b or a (t, d)- sequence in base b, using Owen's scrambling. Then Zi 
has the uniform distribution in [0, 1)'^, that is, for any Lebesgue measurable set 
G C [0, 1)'' , P{zi (z G) — Xd{G), with the d-dimensional Lebesgue measure. 

The last two propositions state that after Owen's scrambling of digital se- 
quences we retain unaffected the low discrepancy properties of the construc- 
tions, and that after this randomization procedure we obtain random samples 
uniformly distributed in [0, 1)''. 

The basic results about the variance of the randomized QMC estimator Qat 
after applying Owen's scrambling to (t, m, (i)-nets in base b (or of (t, (i)-sequences 
in base b ) can be found in |21) . We summarize these results in the following 
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Theorem 5.3 Let Zi, 1 < i < N, be the points of a scrambled (t,m,d)-net 
in base b, and let f be a function on [0, 1)'' with integral I and variance = 
/(/ - Ifdz < oo. Let Qn = J^Zi fi^i)' ^^ere N = 6™. Then for the 
variance V{Qn) of the randomized QMC estimator it holds 

&* 

V{Qn) ^o{l/N), asN-^oo, and V{Qn) < ^ 
For t — we have 

The above theorem says that the variance of a QMC estimator Q n using scram- 
bled (0, m, (i)-nets is always smaller than a small multiple of the variance of the 
corresponding MC estimator. The bound of the theorem above can be im- 
proved (see for example theorem 13.9 in |13)). If the integrand at hand is 
smooth enough, using Owen's scrambling it can be shown that one can obtain 
an improved asymptotic error estimate of order 0{N~2~'s+^^^ for any S > 0, 
see [12]. Improved scrambling techniques have been developed in [23] .124]. 



6 Effective dimensions and sensitivity indices 

In many practical applications, one encounters functions for which the total 
variance is concentrated in a small part of its ANOVA terms. The notion of 
effective dimension of a function was first introduced in |25j to describe the 
contribution of a group of variables to the total variance. 

6.1 ANOVA Decomposition 

Using the ANOVA (Analysis of Variance) decomposition we decompose a func- 
tion into a sum of simpler functions, see [22]. Let D = {!,..., d}. For any 
subset i C D, let |i| denote its cardinality and {D — i) be its complementary 
set in D. Let Zi — {zj : j G i) be the |i|— dimensional vector containing the 
coordinates of z with indices in i. Now assume that / is a square integrable 
function. Then we can write / as the sum of 2'^ ANOVA terms: 

iCD 

where the ANOVA terms /'(x) are defined recursively by 

f\z)^ [ f{z,,ZD^i)dzD-i -y2fKz), 

and = Hf)- The sum of the right-hand side is over strict subsets j ^ i, 
and we use the convention J^^ f{z)dz0 = f{z). The ANOVA terms enjoy 
the following interesting properties: 

1. f{z)dz, = for J e i. 
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2. The decomposition is orthogonal, in that J^^^ f\z)P{z)dz = when- 
ever i 7^ j. 

3. Let <J^{f) = /jQ f{z,Y dz — {I{f)Y be the variance of /, then we have: 

for |i| > is the variance of /' and (J%{f) — 0. 
Definition 6.1 

1. f is said to have effective dimension in the superposition sense dg with 
proportion p, for < p < 1, if ds is the smallest integer that satisfies 

2. f is said to have effective dimension in the truncation sense dt with pro- 
portion p, for < p < 1, if dt is the smallest integer that satisfies 

E ^i(/)>p'^'(/)- 

iC{l,...,dt} 

One can estimate the effective dimension in truncation sense based on the 
algorithm proposed by Wang and Fang ^7}. They show that the following 
equality holds 

/ f{z)f{z^,yD-u)dzdyD-u = E^i (/) + 

Thus, for estimating the effective dimension in truncation sense, we need to 
estimate the following tree type of integrals 

/ f{z)dz, / f\z)dz, / 

y D — U )dzdyD-u, (11) 

J[0,l]<i "'[0,1]'* J[0,l]2''-I"l 

for u = {1, . . . , Z}, I — 1,2, ... , using MC or QMC, until the proportion of 
variance defining the effective dimension is reached. In many applications, the 
proportion value is usually taken as p = 0.99. For problems exhibiting low effec- 
tive dimensions with an effective part of the function belonging to the weighted 
tensor product Sobolev space, one can expect that QMC methods outperform 
MC for high-dimensional integration. If the truncation effective dimension is 
small, than few variables are important for sampling. If the superposition effec- 
tive dimension is small, say dg = 2 or maybe d — 3, than some QMC sequences 
are also expected to outperform MC, because they can exhibit more equidis- 
tributed low dimensional projections than MC. 

The ordering of the variables of the integrand is important for achieving a re- 
duction of the effective dimension in the truncation sense dt , and usually affects 
the performance of QMC in practice. Sensitivity indices usually help to order 
the variables in a convenient way for integration. 
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6.2 Derivative based sensitivities 



As pointed out by Sobol' and Kucherenko in [55], very often derivative based 
measures of sensitivities can successfully be used for detecting non essential 
variables. Small values of first order derivatives of a function implies small 
values of one-dimensional total Sobol' sensitivity indices. Let a?{f) denote the 
partial variance corresponding to the AN OVA term Define 

iCD-.jei 

then it is shown in and f55^ that 



1 



1 



^h}ifY°' = ^ I I [f{z)-f{z,,---,Z,^,,z'^,Z, + ,,---,Zr^)]^dzdz'^, 

from which one obtain the following two results: 



1. if c < \§^\ < C, then 



2 (-(2 
^ ^2 / f\tot ^ ^ 



2. and if ^ e L2([0, 1]''), then 



As a consequence of the bounds stated above, the total variance corresponding to 
non-essential variables of a function can be bounded using first order derivatives 
information. In a wide variety of problems in practice, the gradient of a scalar 
function can be efficiently computed trough algorithmic differentiation (see [29]), 
at a cost at most 4 times of that for evaluating the original function. Thus, a 
cheap method for estimating derivative based sensitivities, and an upper bound 
on the effective dimension in the truncation sense (as stated in the following 
simple Proposition), may be available using algorithmic differentiation. The 
variance is, clearly, invariant to a permutation of the variables. This allowed us 
to consider the following 

Definition 6.2 Given an bijection (permutation) tt : {1, . . . ,d} {1, . . . ,d}, f 
is said to have tt -effective dimension in the truncation sense dt with proportion 
p, for Q < p < \, if dt is the smallest integer that satisfies 

iC{77-i(l),...,7r-i(dt)} 

Proposition 6.3 Let J e L2{[0,1]'^) such that |f e L2{[0,1]'^) ^ ^ < j < n. 
Consider the derivative based sensitivities 

Vi:= ^ [ dz, 
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and consider any permutation tt* : {!,...,(?} {I, . . . ,d} such that 

(a non-increasing ordering of the sensitivities vi 's, resulting in what is called by 
the authors a "Dijf-decay-ordering" n*). 

Let < p < 1 be a fixed proportion parameter. If there exists an integer m such 
that 

d 

E V^,,y^(,■^<{l~p)o\f) (13) 
J— m+1 

then, it follows that the it* -effective dimension in the truncation sense with 
proportion p is at most d — m. 

Proof. 

Let dt denote the tt* -effective dimension in the truncation sense with propor- 
tion p. Consider m satisfying ([13]) and define = {i : i C {(7r*)^^(l), (7r*)^^(m)}} 
and /t„ = X^ieT /'(-^)- follows from the L2 orthogonality of ANOVA de- 
composition and (jl2p that 

{i:iC-DAi^T,„} j=m+l {icr':(7r*)-i0)6i} 

= E 4^-)-o)}(/)"* ^ E -(..)-o)<(i-?'V'(/)- 

j—m+l j—m+1 

It follows (j'^ifTm) ^ P<^^if) and thus dt < m, what was required to be proved. 

□ 



7 Weighted uniform sampling 



In this section we will discuss the method we used to approximate observables as 
they are defined in equation Before the weighed uniform sampling (WUS) 
method can be applied to this expression, it is necessary to perform a transfor- 
mation of the variables Xi to the d-dimensional unit cube, [0, l]'^. In the cases 
we will consider in section [8l this transformation will always be of the form 



(14) 



with A being a positive definite matrix and $ ^ the inverse of the PDF of the 
standard normal distribution. After the transformation equation ([7]) reads: 



(O) 



J^„^^^,0{A<i>-^z))W{z)dzi...dzd 



W{z) 



exp 



---dzd 

-S{A'^-\z)) + W{'f-\z,)f 



(15) 
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Now, in the WUS method points Zj, I < j < N, are generated from a uniform 
distribution in [0, 1]"^. Using these points, a quotient of integrals of the form 

^ ^ /[04]./i(^)rf^ 

/[0,i]d/2(^)c?2 

can then be obtained by taking the estimator 

A Ef=i/i(^j) , . 

— Af , (16) 

Ej = l/2(^j) 

where the functions fi could be of very general, in particular non-Gaussian 
nature. For our example these functions can be read off from equation (1151) : 
/i = 0{A<i>-^{z))W{z) and /a = W{z). For the case that W{z) is really a 
function of z (and not just a constant), this way of evaluating integrals over 
certain weight functions W is known as reweighting technique in field theory or 
statistical physics. A crucial element of the WUS (reweighting) method is that 
the sampling points have a large enough overlap with the weight functions fi 
considered. The estimator in ll6l was analyzed in [TD] and applications have been 
investigated for example in |30] and [31] . The bias and the root mean square 
error (RMSE) of this estimator satisfy 

B^asiR^) = ^^^^ - S^lliMA + o(A.-i) 

RMSEiR.) . V^'arih)+R^varih)-2RcovifU,) ^ 

V N 

The bias of the estimator is asymptotically negligible compared with the RMSE. 
One clear disadvantage of WUS against Mc-MC or Importance Sampling for 
problems with large regions of relative low values of the integrands is that with 
WUS we sample over the entire unit cube [0, 1]** uniformly, thus the method is 
dependent on how we transformed the problem to the unit cube. In contrast, 
Mc-MC and Importance Sampling based techniques for models in high-energy 
or statistical physics usually focus on characteristic or important regions of the 
integrands aiming to sample directly from the underlying distribution of the 
problem, using in this way only the most relevant sample points. 



8 Numerical experiments 

We consider for our numerical tests the quantum mechanical harmonic and an- 
harmonic oscillator in the path integral approach as described in section 2. For 
definiteness we repeat here the expression for the action of the system: 



(17) 



We investigate the two observable functions 

d 



1=1 i=\ 
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using the notation for (Oi(x)),(02(x)) in our tests. In addition, we 

will look at the ground state energy Eq which, by virtue of the virial theorem, 
is related to d and O2 by Eq = fi'^Oi + 3XO2 + j^. 

8.1 Harmonic Oscillator 

For the harmonic oscillator we can apply immediately the direct sampling ap- 
proach described in sections [3] and [7] for calculating estimates of observables O 
by setting 

/l =0(^$-l(^)) , /2 = 1 

in (fTC]) . The matrix A is a square root of C, the covariance matrix of the 
variables Xi, appearing in the action if it is expressed as a bi-linear form: S = 
^x'^C~^x, written explicitly 



^-1 , 2Mo 



9 9 

^ . (18) 



Different factorizations, namely Cholesky and PCA (principle component anal- 
ysis) have been tried out. The PCA based factorization turned out to perform 
better in our tests, which is the reason why we will only show results for this 
method. The PCA factorization can be explicitly obtained for circulant Toeplitz 
matrices and the matrix-vector products can be efficiently computed by means 
of the fast Fourier transform. Given that the covariance matrix C is circulant 
Toeplitz, we have that C = GAG'^ , with G := Re{F) + Im{F), 

{F)u ^ ^e-^« (19) 

being the Fourier matrix and A the diagonal matrix of positive eigenvalues 
(Lemma 4 in [32])- Thus A = GA2 is a factorization of C, and in this case one 
can follow a recipe for generating normals with randomized QMC based on the 
discrete Fourier transform and using fast Fourier transform (FFT) techniques 
as described in 1321: 



1. Generate a randomized QMC point z. 

2. Compute y = <i>^^(i;). 

3. Compute w = {VNy7r-^i), y%.y7,-^d)), where 



ft- 



(u — cos(27rj/d)) 



, 1 < j < (20) 



are the eigenvalues in the diagonal matrix A, and 7r(.) is a fixed permuta- 
tion of the variables. 

4. Compute V = FFT(W). 

5. Take x = i?e(v) + /m(v) as the resulting point sample. 
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We reorder the sampling points in step 3 in order to be able to apply standard 
FFT. It is (strongly) recommended to fix first the permutation 7r(.) such that 
(/5ir(j))j=i in non-increasing order, and this permutation was taken in our 
experiments. If this permutation of variables does not lead to satisfactory re- 
sults, the analysis described in l6.2l can be carried out to investigate if a possible 
different permutation leads to more effective dimension reduction and better 
results. 

In the ordinary Mc-MC approximation, we used the Mersenne Twister }11| pseudo 
random number generator. We note in passing that the Mc-MC samples were 
generated in exactly the same way as described above for QMC with the only dif- 
ference that in step one Mc-MC points were generated according to the Gaussian 
measure of the harmonic oscillator. This corresponds to a heatbath algorithm, 
where all variables Xi are updated at the same time and a reweighting procedure 
in the anharmonic case. For the QMC tests, we use randomly scrambled Sobol' 
sequences using the technique proposed by J. Matousek[23]. For a description 
of how to implement Sobol sequences, we refer to [33] and [34]. Moreover, one 
can find a three-pages-note with a short and simple description on the web- 
site http://web.matlis.unsw.edu.au/~fkuo/sobol/index.html. The error of 
(X^) was obtained by scrambling 10 times the QMC sequence and making 10 
runs of an Mc-MC simulation (with different seeds). This procedure is repeated 
30 times in both cases to obtain the error of the error. From the results shown in 
figure [TJ we can see a scaling of the errors for both methods that agrees perfectly 
with the expected behavior, namely N~^-^ for Mc-MC and for QMC, for 
large N . Although this example is trivial, it was our first successful application 
of the QMC approach in a physical lattice model and motivated us to pass on 
to more complicated models. 

8.2 Anharmonic Oscillator 

The WUS (reweighting) approach was also used for the problem of the anhar- 
monic oscillator to estimate (X''), {X"^) and the ground state energy of the 
system (£^o)- With the anharmonic term in action, the probability distribution 
function (PDF) of the variables Xi is of non-Gaussian nature and hence becomes 
very complicated. This makes it very hard to generate the samples directly from 
the PDF of the anharmonic oscillator. Instead of this, the anharmonic term and 
a fraction of the harmonic term is treated as part of the weight functions /i and 
/2 in (jl6p . leaving the sampling procedure of the Xi basically unchanged as 
compared to the harmonic oscillator. For the weight functions in [15] we have 
chosen 

h{z) = 0{x)h{z) , h{z)=e^\\ 'J' 'J, (21) 
X = A^-^{z) . 

For the evaluation of x, not the original parameter fi^ in equation (jl8|) was 
taken but a shifted parameter /x^j^ which is exactly compensated by the weight 
function /2. This procedure is necessary, because of C = A^A being positive 
definite only if ^^^^ > 0, whereas /Lt^ < in our test cases. Thus, we have to 
shift the value of fi in the harmonic piece of the action to guarantee positivity. 
Besides the requirement of positivity, one is free in the choice of the parameter 
Hsim- In fact, it turned out that one can take advantage of this freedom to 
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Error of <x^> for the Harmonic Oscillator 




Number of samples - N 



Figure 1: The error of {X'^) in dependence on the number of 
samples N . The parameters here were chosen as A = (har- 
monic osciUator), d = 51, Mq = 0.5 and fi^ = 2.0. The error of 
the error was obtained by repeating the numerical experiment 
30 times, see also the text. 



tune Hsim to a value that reduces the fluctuations of the weights. This leads to 
observable averages with less variances and therefore smaller errors. Further, it 
is important to note that the PCA factorization during the generation of the 
Gaussian samples is essential for an efficient reduction of the effective dimension 
(see [15]) of the problem. For the parameters listed below, we estimated the 
effective dimensions of the functions (PT|) to be close to 20 (for a 99% variance 
concentration), for estimating the integrals described in (jlll) with the dimension 
of the original system up to c? = 1000. Thus, we observe a drastic reduction of 
dimensionality. 

On the other hand we found that the effective dimension depends very 
strongly on the parameter T = da, i.e. the physical time extent of the system. 
For small T- values, say T < 0.2, the effective dimension is reduced sufficiently 
good, such that the QMC approach leads to an 1/A^ error scaling. The situa- 
tion changes for T = 1.5, where the effective dimension is significantly increased 
and, correspondingly, the error behaves like 1/7V" with a « 0.75 only, see below. 
Tests with values of T > 5 indicate that the simulations become more and more 
difficult in the sense that one needs more and more samples to achieve the same 
accuracy of an observable as compared to estimates at T — 1.5. Thus, in such 
situations the overlap of the sampling points with the functions fi in [16] seem to 
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o 


a 


logC 


xVdof 


d= 100 




-0.763(8) 


2.0(1) 


7.9 / 6 






-0.758(8) 


4.0(1) 


13.2 / 6 






-0.737(9) 


4.0(1) 


8.3 / 6 


d = 1000 


X^ 


-0.758(14) 


2.0(2) 


5.0 / 4 




x^ 


-0.755(14) 


4.0(2) 


5.7 / 4 






-0.737(13) 


4.0(2) 


4.0 / 4 



Table 1: Shown are the resuhs for fit parameters of the error scahng for the 
observables considered, i.e. X^ , X^ and i?o, where the used fit function takes 
the form error = CN"-, see equation (25] We also provide the values as well 
as the number of degrees of freedom in the fit (dof). 

be too small to reduce the fluctuations sufficiently. However, we are presently 
exploring a more general approach for selecting an optimal matrix C (resp. A 
in (jl4p ) in the sampling procedure to improve the situation for larger values of 
T. 

For our numerical experiments, the parameters were set to Afg — 0.5, A — 
1.0, jj? = —16. In the two tests the lattice spacing a was adjusted such, that 
T was kept fixed. The optimal value of /x^^^ generally depends on all physical 
parameters of the system and in particular on a. Thus, we have to adjust also 
ptsim when the lattice spacing a is changed. In particular, we set a = 0.015 and 
^2^^ ^ 0.176 for d = 100, whereas for d = 1000 a = 0.0015 and ^i^im = 0.2 
was chosen. The error analysis of {X'^) and {X'^) was carried through in the 
same way as described for the harmonic oscillator test case discussed in the last 
subsection. 18.11 We show in figure [5] the error of {X'^) and Eq as a function of 
the number of samples. In addition, we represent by the dashed line in figure [5] 
a fit to the data for the computed errors using the formula 

log ( Error ((O))) =logC + alogiV ; O = {a;^, x^ -Eq} , (22) 

with C and a left as free parameters. From this analysis we can obtain a 
quantitative determination of the exponent of the error scaling. The results for 
the fit parameters are listed in table [T] 

As can be inferred from table [T] in the case of the anharmonic oscillator the 
error scaling exponent is not the optimal one of a = 1 but assumes a value of a w 
0.76. However, this constitutes still a much improved error scaling compared to 
a Mc-MC simulation with a corresponding large gain in the number of required 
samples to reach a desired accuracy. Moreover, the value of a is consistent 
for all observables considered here and independent from the dimension of the 
problem employed, a finding which is clearly encouraging. 

We finally mention that the resulting estimates of the ground state energy 
matches in at least two significant digits with the theoretical value, Eq — 3.863, 
calculated in [3S], namely Eq = 3.857±0.004 for d = 100 and Eq = 3.862±0.004 
for d = 1000. 
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9 Concluding Remarks 



In this article we have performed a first apphcation of the QMC method to 
Euchdean lattice models. The goal was to see, whether the QMC algorithm 
provides also in the case of non- Gaussian systems an improved error scaling 
behaviour with respect to Markov-chain Monte Carlo methods. As a prototype 
system, we have considered the quantum mechanical oscillator discretized on a 
Euclidean time lattice, both in its harmonic (Gaussian) form as well as adding a 
non-Gaussian quartic term (anharmonic oscillator). For the harmonic oscillator 
we found a large- iV [N being the number of sample points) error behavior as 
expected, i.e. for QMC (~ 1/N) and for Mc-MC (~ 1/v^). 

The main result of our investigation is that also for the anharmonic oscillator, 
which is a non-Gaussian problem, the QMC approach leads to a significant 
improvement of the error scaling with a w —0.76, see table [T] for the exact 
values of a for different observables and different physical situations. 

Further, we found that the choice of the lattice spacing a does not seem 
to have any effect on the error scaling behaviour and the accessible range of 
T — 1.5 values gives already estimates of the ground state energy, compatible 
(within errors) with the theoretical prediction (valid in the limit T — oo and 
a — > 0). For the case that the improved error scaling and the mild dependence 
on the lattice spacing a found here will also be present in more elaborate models, 
the QMC has the potential to become very valuable in the future. On the other 
hand we observed that the applicability of the WUS (reweighting) approach 
seems to be limited by the physical time extent T — da of the system. Stable 
results could only be found for values T < 1.5. 

It is clear that the here considered quantum mechanical systems are rather 
simple models and still a long way has to be gone, if generic quantum field 
theories, especially gauge theories are to be studied. Nevertheless, it is very 
reassuring that we find an improved error scaling behaviour in the case of a 
quartic potential and hence a non-Gaussian system. This promising result is 
certainly a strong motivation for studying further QMC methods in lattice field 
theories and statistical mechanics. 
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Figure 2: We show the error of the observables {X'^) and Eq as a function of the 
number of samples in a double logarithmic graph. The error of the error was 
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randomly scrambled Sobol' (RQMC) was used with 2", 2^6, 2^'^, and 
2^^ points. The dashed line shows the fit to the data points using a fit function 
log(A(0)) - log(C) + alog(A^). The fitted exponents are a = -0.758(14) for 
(X^) and a = -0.737(13) for Eq, see also tabled] 
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